Evaluating Clone Detection Techniques
Abstract
In the last decade, several researchers have investigated techniques to detect duplicated code in programs exceeding hundreds of thousands of lines of code. All of these techniques have known merits and deficiencies, but as of today, little is known about where to fit these techniques into the software maintenance process. This paper compares three representative detection techniques (simple line matching, parameterized matching, and metric fingerprints) by means of five small to medium cases and analyses the differences between the reported matches. Based on this experiment, we conclude that (1) simple line matching is best suited for a first crude overview of the duplicated code; (2) metric fingerprints work best in combination with a refactoring tool that is able to remove duplicated subroutines; (3) parameterized matching works best in combination with more fine-grained refactoring tools that work on the statement level.
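To make the first of the three techniques concrete, the following is a minimal sketch of simple line matching. It is an illustration under stated assumptions, not the tool evaluated in the paper: the function names `normalize` and `find_duplicated_runs`, the choice to strip all whitespace before comparing, and the `min_run` threshold are all hypothetical.

```python
from collections import defaultdict


def normalize(line: str) -> str:
    """Drop all whitespace so layout differences do not hide a match.
    (Assumed normalization; a real tool may also strip comments.)"""
    return "".join(line.split())


def find_duplicated_runs(lines, min_run=3):
    """Return (start_a, start_b, length) for maximal runs of identical
    normalized lines, with start_a < start_b (0-based indices).
    Blank lines never match, so they end a run."""
    norm = [normalize(line) for line in lines]

    # Index every non-empty normalized line by the positions where it occurs.
    positions = defaultdict(list)
    for i, text in enumerate(norm):
        if text:
            positions[text].append(i)

    # Every pair of positions holding the same line is a candidate match.
    pairs = set()
    for occ in positions.values():
        for x in range(len(occ)):
            for y in range(x + 1, len(occ)):
                pairs.add((occ[x], occ[y]))

    runs = []
    for a, b in sorted(pairs):
        if (a - 1, b - 1) in pairs:
            continue  # this pair continues an earlier run, not a new start
        length = 1
        while (a + length, b + length) in pairs:
            length += 1
        if length >= min_run:
            runs.append((a, b, length))
    return runs
```

Comparing every pair of identical lines is quadratic in the worst case, which is acceptable for a first crude overview of a small to medium code base but would need a smarter index for very large systems.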
Similar resources
Detection of abl/bcr Fusion Gene in Patients Affected by Chronic Myeloid Leukaemia by Dual-Colour Interphase Fluorescence in situ Hybridisation
Conventional cytogenetic is the standard technique for detection of Philadelphia (Ph) chromosome in chronic myeloid leukemia (CML). Evaluation of abelson murine leukemia/breakpoint cluster region (abl/bcr) fusion using dual-colour fluorescence in situ hybridization (D-FISH) is an alternative approach allowing rapid and reliable detection of the disease. We employed the technique of interphase D...
Near-miss function clones in open source software: an empirical study
The new hybrid clone detection tool NICAD combines the strengths and overcomes the limitations of both text-based and AST-based clone detection techniques and exploits novel applications of a source transformation system to yield highly accurate identification of cloned code in software systems. In this paper, we present an in-depth study of near-miss function clones in open source software usi...
Comparison of Clone Detection Techniques
Many techniques for detecting duplicated source code (software clones) have been proposed in the software reengineering literature. However, comparison of these techniques in terms of performance is not widely studied. There are four general categories for clone detection techniques; textual, lexical, syntactic, and semantic. This report presents an experiment that evaluates different clone det...
Comparison and evaluation of code clone detection techniques and tools: A qualitative approach
Over the last decade many techniques and tools for software clone detection have been proposed. In this paper, we provide a qualitative comparison and evaluation of the current state-of-the-art in clone detection techniques and tools, and organize the large amount of information into a coherent conceptual framework. We begin with background concepts, a generic clone detection process and an ove...
Code Clone Detection Technique Using Program Execution Traces
Code clone is a code fragment that has identical or similar fragments to it in the source code. Many code clone detection techniques and tools have been proposed. However, source code derived by copy-and-paste may be disguised by obfuscation because these techniques detect code clone using only static information such as source code or binary. Therefore, we propose a new clone detection techniq...
A Taxonomy of Clones in Source Code: The Re–Engineers Most Wanted List
Code cloning — that is, the gratuitous duplication of source code within a software system — is an endemic problem in large, industrial systems [6, 5]. While there has been much research into techniques for clone detection and analysis, there has been relatively little empirical study on characterizing how, where, and why clones occur in industrial software systems. Our current research is to p...
Publication date: 2003